Measurement of work in single-molecule pulling experiments

I. INTRODUCTION

In a typical single-molecule pulling experiment, an individual molecular construct is stretched by means of a device (e.g., optical or magnetic tweezers, or an atomic force microscope (AFM)) able to measure both the applied force, usually on the piconewton scale, and the end-to-end molecular extension, typically expressed in nanometers. Many interesting kinetic and thermodynamic properties of the stretching process can be inferred from the resulting force-extension curve (henceforth, FEC); in particular, the free energy difference between the folded and the unfolded state can be evaluated by exploiting a well-known result of nonequilibrium thermodynamics, the Jarzynski equality



$$
\beta W_{\rm rev} = -\log \left\langle \exp[-\beta W(\Gamma)] \right\rangle_\Gamma , \tag{1}
$$



where $W(\Gamma)$ is the amount of work performed on the system throughout the stretching process $\Gamma$, $\beta$ is as usual the inverse of the thermal energy $k_BT$, and $W_{\rm rev}$ is the reversible work, i.e., the work needed to perform the pulling experiment in quasi-equilibrium conditions. Since a single molecule is a small system, $W(\Gamma)$ is affected by thermal fluctuations; the angular brackets $\langle\cdots\rangle_\Gamma$ thus stand for an average over all possible realizations of the same experimental protocol. In fact, a generalization of the Jarzynski equality due to Hummer and Szabo makes it possible to reconstruct the whole free energy landscape as a function of the molecular extension. This program has been successfully applied to the experimental study of multi-domain proteins.

Much research has been devoted to the practical difficulties that arise when Eq. (1) is applied to the free energy reconstruction problem, e.g., the bias induced by the finite number of experimental attempts, the role played by the resolution of the measuring apparatus, or the effect of instrument noise and experimental error. The present article deals with yet another possible source of error, which, though already known, has generally been dismissed as negligible without a compelling argument. The point is that in most experimental settings the molecular extension is not the proper control parameter, so that it is not correct to interpret the area below the FEC as the work that appears in Eq. (1). If the control parameter is the total distance, the area under the force-distance curve (FDC) should be used instead.

Here we thoroughly analyze under which conditions the use of the wrong definition of the work can appreciably affect the estimate of free energy differences by means of Eq. (1). The conclusion, in a nutshell, is that the error induced by the substitution may be as large as 100%, depending on the number of experiments and on the data acquisition frequency. Also important are the details of the data analysis procedure: how the integration extrema are chosen, what method is used to integrate the FEC, and how different FECs are aligned to correct for instrumental drift effects.

The paper is organized as follows: First, we get some theoretical insight by considering our problem in its simplest possible setting (Sec. II). Then, we validate our conclusions with an experimental test implemented with optical tweezers and DNA hairpins (Sec. III). A recapitulation of our results (Sec. IV) and appendices with some technicalities round off this article.



II. A TOY MODEL 

A detailed model for single-molecule experiments with optical tweezers has been discussed elsewhere. Here we consider a simplified version of it, which retains only the physical features directly relevant to our problem. Although the toy model in this section is phrased in the language of optical tweezers, it takes no effort to translate it into AFM nomenclature, the mathematics being just the same.

In our model, graphically depicted in Fig. 1, the optical trap is moved by the experimenter, hence the proper control parameter is the trap-pipette distance $\lambda$, while the end-to-end molecular extension, denoted by $x$, is a quantity subject to fluctuations. The trap is a harmonic potential with stiffness $k_b$, while $k_m$ is the stiffness of the molecular construct comprising hairpin and handles. Given a fixed value of the control parameter $\lambda$, the state of the system is specified by the pair $(x,\sigma)$, where $\sigma$ is a label taking values 0 if the hairpin is closed (or folded) and 1 if it is open (or unfolded). The hairpin itself is a pure two-state system whose state-dependent length is $\ell_\sigma$. The bead is thus subject to the net force

$$
f_t(x,\sigma) = k_b(\lambda - x) - k_m(x - \ell_\sigma) . \tag{2}
$$

FIG. 1: Schematic definition of the model under study. The pipette is at rest with respect to the thermal bath, while the trap is moving with velocity $v$. The trap and the system molecule + handles are approximated by two harmonic potentials with stiffness $k_b$ and $k_m$, respectively. The rest length of the trap spring $k_b$ is zero, while the rest length of the molecule spring $k_m$ is $\ell_0$ if the hairpin is closed ($\sigma = 0$) and $\ell_1$ if it is open ($\sigma = 1$).



It is convenient to introduce the total stiffness $k_t = k_b + k_m$ and the equilibrium position [defined by the condition $f_t(x_{\rm eq},\sigma) = 0$]

$$
x_{\rm eq}(\sigma) = \frac{k_b\lambda + k_m\ell_\sigma}{k_t} , \tag{3}
$$

so that Eq. (2) can be rewritten as

$$
f_t(x,\sigma) = -k_t\,[x - x_{\rm eq}(\sigma)] . \tag{4}
$$



The relaxation time of the velocity autocorrelation function, $\tau = m/\gamma$ ($m$ being the mass and $\gamma$ the friction coefficient of the bead in the trap), is small enough compared to the duration of the experiment that we can assume mechanical equilibrium, i.e., the average value of the total force $\langle f_t(t)\rangle$ is zero. The Hamiltonian function is given by

$$
H^{(\lambda)}(x,\sigma) = \tfrac{1}{2} k_b(\lambda - x)^2 + \tfrac{1}{2} k_m(x - \ell_\sigma)^2 + \sigma\,\Delta G_0 , \tag{5}
$$

where $\Delta G_0$ is the free energy difference between the open and closed states of the hairpin in the absence of applied force. The analytic solution to the equilibrium thermodynamics of this model is summarized in App. A.



The transitions of the hairpin are governed by a simplified Kramers-Bell kinetics with rates for opening, $k_\rightarrow$, or closing, $k_\leftarrow$, given by

$$
k_\rightarrow = k_0 \exp\left[\frac{w_0 f_0(x)}{k_BT}\right] , \tag{6a}
$$

$$
k_\leftarrow = k_0 \exp\left[\frac{-w_1 f_1(x) + \Delta G_0}{k_BT}\right] , \tag{6b}
$$

where $w_0$ and $w_1$ represent the distances from the barrier to the closed and the open states, respectively, $f_0$ and $f_1$ are two functions of $x$ with physical dimensions of a force, and $k_0$ is the attempt frequency. The rates just defined must respect the detailed balance condition

$$
\frac{k_\rightarrow}{k_\leftarrow} = \exp\left[-\frac{H^{(\lambda)}(x,1) - H^{(\lambda)}(x,0)}{k_BT}\right] \tag{7}
$$

for each $\lambda$ and for each $x$. This requirement implies

$$
w_0 f_0(x) + w_1 f_1(x) = \frac{k_m}{2}(\ell_1 - \ell_0)\,[2x - (\ell_1 + \ell_0)] . \tag{8}
$$

Our choice here is to take simply $f_0(x) = f_1(x)$, so that $w_0 + w_1 = \ell_1 - \ell_0$.

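The dynamics of our model is ruled by the overdamped Langevin equation

$$
\gamma \frac{dx}{dt} = f_t(x(t),\sigma) + \sqrt{2\gamma k_BT}\,\zeta(t) , \tag{9}
$$

where $\zeta(t)$ is a Gaussian white noise:

$$
\langle \zeta(t) \rangle = 0 , \tag{10a}
$$

$$
\langle \zeta(t)\zeta(t') \rangle = \delta(t - t') . \tag{10b}
$$

The experimental protocol is defined by the choice of a function $\lambda(t)$. Here we consider pulling at constant velocity: $\lambda(t) = \lambda_0 + vt$.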
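As an illustration, the model defined by Eqs. (2), (6) and (9) can be integrated numerically along the following lines. This is only a minimal sketch and not the code used for the simulations reported in this paper: the Euler-Maruyama scheme, the function name `simulate_pull`, the symmetric barrier $w_0 = w_1$ and all default parameter values (in pN, nm and s) are assumptions made for the example.

```python
import numpy as np

KBT = 4.11  # thermal energy near 25 C, in pN nm


def simulate_pull(lam_i, lam_f, v, dt, rng,
                  k_b=0.07, k_m=0.5, l0=0.0, l1=18.0,
                  dG0=60.0 * KBT, k0=1e-12, gamma=2.8e-5):
    """Euler-Maruyama integration of Eq. (9) with two-state jumps, Eqs. (6).

    All default parameter values are illustrative placeholders.
    Returns the arrays (lam, x, sigma) sampled at every time step.
    """
    n_steps = int(round((lam_f - lam_i) / (v * dt)))
    lam = lam_i + v * dt * np.arange(n_steps + 1)
    x = np.empty(n_steps + 1)
    sigma = np.zeros(n_steps + 1, dtype=int)
    k_t = k_b + k_m
    w0 = w1 = 0.5 * (l1 - l0)               # assumed: w0 + w1 = l1 - l0
    lengths = (l0, l1)
    x[0] = (k_b * lam[0] + k_m * l0) / k_t  # start at x_eq, hairpin closed
    noise_amp = np.sqrt(2.0 * KBT * dt / gamma)
    for j in range(n_steps):
        s = sigma[j]
        # Net force on the bead, Eq. (2)
        f_t = k_b * (lam[j] - x[j]) - k_m * (x[j] - lengths[s])
        x[j + 1] = x[j] + f_t * dt / gamma + noise_amp * rng.standard_normal()
        # Force entering the rates, with the choice f0(x) = f1(x), Eq. (8)
        f = k_m * (x[j] - 0.5 * (l0 + l1))
        if s == 0:
            rate = k0 * np.exp(min(w0 * f / KBT, 500.0))
        else:
            rate = k0 * np.exp(min((-w1 * f + dG0) / KBT, 500.0))
        # Probability of at least one transition during dt
        jump = rng.random() < 1.0 - np.exp(-rate * dt)
        sigma[j + 1] = 1 - s if jump else s
    return lam, x, sigma
```

The time step must be small compared to $\gamma/k_t$ for the integration to be stable; the measured force then follows as `k_b * (lam - x)`.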



A. Accumulated vs. transferred work 

For the toy model introduced in the previous section, $\lambda$ is the control parameter, which can be directly manipulated, while the molecular extension $x$ is subject to Brownian fluctuations. Therefore, the work performed on the system throughout a pulling experiment $\Gamma$ that starts at time $t_i$ from $\lambda = \lambda_i$ and terminates in $\lambda = \lambda_f$ at time $t_f = t_i + (\lambda_f - \lambda_i)/v$ is properly defined as

$$
W(\Gamma) \equiv \int_{\lambda_i}^{\lambda_f} \frac{\partial H^{(\lambda)}}{\partial \lambda}\, d\lambda = \int_{\lambda_i}^{\lambda_f} f_b(\lambda,x)\, d\lambda , \tag{11}
$$

where we used Eq. (5) and $f_b(\lambda,x) = k_b(\lambda - x)$ is the force induced by the displacement of the bead in the trap. Such work is measured in practice as the area under the force-distance curve [FDC, see Fig. 2(a)]. Note that for all single-molecule techniques that we are aware of, $f_b$ is actually the only force that is experimentally measurable. In the following we will for simplicity drop the subscript and write $f$ instead of $f_b$.

FIG. 2: (a) A typical force-distance curve (FDC) obtained by numerical simulation of Eq. (9). The shaded area is equivalent to the accumulated work $W(\Gamma)$ [see Eq. (11)]. (b) The force-extension curve (FEC) associated with the pulling experiment represented in Fig. 2(a). The shaded area is equivalent to the transferred work $W'(\Gamma)$ [see Eq. (12)].

The area under the FEC [see Fig. 2(b)], on the other hand, is what in Ref. [15] is called transferred work [as opposed to the accumulated work $W(\Gamma)$]:

$$
W'(\Gamma) \equiv \int_{x_i}^{x_f} f(\lambda,x)\, dx , \tag{12}
$$

where $x_i$ and $x_f$ are the trajectory-dependent values of the molecular extension at times $t_i$ and $t_f$, respectively.

At each point along the trajectory $\Gamma$, the control parameter and the molecular extension are related by

$$
x = \lambda - \frac{f}{k_b} . \tag{13}
$$

Since Eq. (13) gives $d\lambda = dx + df/k_b$, and the integral of $f\,df$ depends only on the end points, this implies the following relation between the area under an FDC and the area under the corresponding FEC:

$$
W(\Gamma) = W'(\Gamma) + \frac{f_f(\Gamma)^2 - f_i(\Gamma)^2}{2k_b} , \tag{14}
$$

where $f_i$ and $f_f$ are the (trajectory-dependent) initial and final values of the force, respectively. The difference between $W$ and $W'$ is therefore a pure boundary term.
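The boundary-term relation of Eq. (14) can be checked on any sampled trajectory. The sketch below is an illustration under stated assumptions: the trajectory is an arbitrary synthetic one (a linear response plus white noise), the stiffness value is a placeholder, and the helper names are hypothetical. When both areas are computed with the trapezoidal rule on the same samples, the identity holds to rounding error, whatever the noise.

```python
import numpy as np


def trapz(y, x):
    """Trapezoidal rule, written out explicitly."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def accumulated_and_transferred_work(lam, x, k_b):
    """Return W (area under the FDC), W' (area under the FEC) and the
    boundary term of Eq. (14) for one sampled trajectory."""
    f = k_b * (lam - x)                      # measured force, f = k_b (lambda - x)
    w_acc = trapz(f, lam)                    # Eq. (11)
    w_tra = trapz(f, x)                      # Eq. (12)
    boundary = (f[-1] ** 2 - f[0] ** 2) / (2.0 * k_b)
    return w_acc, w_tra, boundary


rng = np.random.default_rng(1)
lam = np.linspace(0.0, 50.0, 2001)                        # nm, placeholder protocol
x = 0.8 * lam + rng.normal(scale=2.0, size=lam.size)      # synthetic noisy extension
W, W_prime, boundary = accumulated_and_transferred_work(lam, x, k_b=0.07)
print(W - W_prime, boundary)   # the two numbers agree to rounding error
```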



B. The reversible work 



If we realize the pulling experiment in conditions of quasi-equilibrium, that is, at infinitesimally small velocity $v \to 0$, then we obtain the thermodynamic force-distance curve (TFDC), whose analytical expression is given by Eq. (A5). The area under the TFDC is the reversible work $W_{\rm rev}$, equal to the free energy difference between the final and initial states of the system. From an experimental perspective, however, the really interesting quantity is rather the free energy difference $\Delta G_0$ between the open and closed states of the hairpin at zero external force. According to Eq. (A13), this is given by

$$
\Delta G_0 = W_{\rm rev} - \frac{\phi_f^2 - \phi_i^2}{2k_{\rm eff}} , \tag{15}
$$

where $\phi_{i(f)}$ is the equilibrium initial (final) value of the force, and $k_{\rm eff}$ is the effective stiffness

$$
\frac{1}{k_{\rm eff}} = \frac{1}{k_b} + \frac{1}{k_m} . \tag{16}
$$

The thermodynamic force-extension curve (TFEC) is the quasi-equilibrium pulling experiment plotted as a function of the molecular extension $x$. If we define $W'_{\rm rev}$ as the area under the TFEC, then Eq. (14) yields

$$
\Delta G_0 = W'_{\rm rev} - \frac{\phi_f^2 - \phi_i^2}{2k_m} . \tag{17}
$$

So we see that either $W_{\rm rev}$ or $W'_{\rm rev}$ is equally useful to extract the free energy of formation $\Delta G_0$ of the hairpin. The problem is that it is often impractical (and sometimes impossible) to achieve quasi-equilibrium conditions. This is where the Jarzynski equality comes into play, as we see in the next section.


C. Jarzynski estimator 

The Jarzynski equality, Eq. (1), gives us a recipe to compute the reversible work, given a suitably sized collection of irreversible processes. The work that appears in Eq. (1) is the accumulated work $W(\Gamma)$ defined in Eq. (11); nonetheless, in some cases the most readily available data for the experimenter is the FEC, so that the work that is measured is in fact the transferred work $W'(\Gamma)$ of Eq. (12). On such occasions, the transferred work has been used in the Jarzynski equality, under the assumption that the resulting error is small compared to other sources of experimental uncertainty [22,24].

In this section, we answer the following question: How large an error in the evaluation of $\Delta G_0$ is made if the transferred work $W'(\Gamma)$ is used instead of the accumulated work $W(\Gamma)$?

FIG. 3: Dependence on the sample size $n$ of the mode $\omega'$ of $W'_{(1)}$ (i.e., the maximum of the distribution for $W'_{(1)}$, see App. B). The dimensionless variable $z$ is $(\mu - \omega')/(\sqrt{2}\sigma)$, where $\mu$ and $\sigma$ are the mean and standard deviation of the normally distributed transferred work $W'$. The represented curve is the numerical solution to Eq. (B9).

Let us call $\overline{W}$ the Jarzynski estimate of the reversible work $W_{\rm rev}$, based on $n$ experiments that produce the set of work measurements $\{W_i\}$:

$$
\beta \overline{W} = -\log\left[\frac{1}{n}\sum_{i=1}^{n} \exp(-\beta W_i)\right] . \tag{18}
$$

The analogous quantity obtained using the transferred work is

$$
\beta \overline{W}' = -\log\left[\frac{1}{n}\sum_{i=1}^{n} \exp(-\beta W'_i)\right] . \tag{19}
$$

The quantity $\overline{W}$ is guaranteed by Eq. (1) to be an estimator of the reversible accumulated work $W_{\rm rev}$, whereas $\overline{W}'$ is not the proper way to compute the reversible transferred work $W'_{\rm rev}$ (a bona fide way to estimate $W'_{\rm rev}$ is discussed in Ref. [15]). We now set out to evaluate the difference $\overline{W} - \overline{W}'$.

To begin with, we sort the set $\{W_i\}$ in ascending order:

$$
W_{(1)} < W_{(2)} < W_{(3)} < \cdots < W_{(n)} . \tag{20}
$$

The key observation is that the sum of exponentials in Eq. (18) is dominated by the minimum work trajectory:

$$
\beta \overline{W} \approx \beta W_{(1)} + \log n . \tag{21}
$$

Repeating the same argument for the set $\{W'_i\}$ that collects the measured values of the transferred work, we find

$$
\overline{W} - \overline{W}' \approx W_{(1)} - W'_{(1)} . \tag{22}
$$

Note that the trajectory that realizes the minimum of $\{W_i\}$ is generally not the same as the one that gives the minimum of $\{W'_i\}$.
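For concreteness, the estimator of Eq. (18) and its minimum-work approximation, Eq. (21), can be written as follows. This is a hedged sketch: the function names are hypothetical, and the subtraction of the largest exponent before exponentiating is a standard numerical precaution (a "log-sum-exp" trick) that we add here to avoid underflow when the works span many $k_BT$; it does not change the value of the estimate.

```python
import numpy as np


def jarzynski_estimate(works, kBT=1.0):
    """Jarzynski estimate of Eq. (18) for a set of work values.

    Works and kBT must be expressed in the same energy units.
    """
    a = -np.asarray(works, dtype=float) / kBT
    a_max = a.max()
    # log of the sample mean of exp(-W/kBT), computed stably
    log_mean_exp = a_max + np.log(np.mean(np.exp(a - a_max)))
    return -kBT * log_mean_exp


def min_work_approximation(works, kBT=1.0):
    """Approximation of Eq. (21): only the minimum-work trajectory is kept."""
    works = np.asarray(works, dtype=float)
    return works.min() + kBT * np.log(works.size)
```

The same function applied to a set of transferred work values gives the quantity of Eq. (19). By construction the estimate always lies between the smallest work value and the sample mean.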

In order to go further in our analytical approximation, we need to specify the distributions of $W$ and $W'$. Based on our experience with both experimental and simulated data, we assume that $W'$ is normally distributed (see Fig. 8) with mean $\mu$ and variance $\sigma^2$, while for $W$ we adopt a Gumbel distribution (see Fig. 9) with parameters $a$ and $b$ [which are related to the average and standard deviation of the accumulated work $W$ by means of Eq. (B12) in App. B]. This latter choice is the simplest distribution that exhibits the asymmetry we expect from a nonlinear system (in the case of linear systems the work distribution is Gaussian). Also, there are theoretical arguments suggesting that the Gumbel distribution may play a universal role for correlated random variables similar to the one played by the Gaussian distribution for uncorrelated ones.

We can now estimate the distribution of $W_{(1)}$ and $W'_{(1)}$. The details can be found in App. B; here we quote just the final result: the most likely value of $W_{(1)} - W'_{(1)}$ is approximately

$$
a - b\log n - \mu + \sqrt{2}\,\sigma z(n) , \tag{23}
$$

where $z(n)$ is the function of the sample size represented in Fig. 3.

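What we are really interested in, however, is $\Delta G_0$. If we put $W_{\rm rev} = \overline{W}$ [the estimate of Eq. (18)] in Eq. (15) and call $\Delta G_0'$ the result of setting $W'_{\rm rev} = \overline{W}'$ [the estimate of Eq. (19)] in Eq. (17), we get

$$
\Delta G_0 - \Delta G_0' \approx a - b\log n - \mu + \sqrt{2}\,\sigma z(n) - \frac{\langle f_f\rangle^2 - \langle f_i\rangle^2}{2k_b} . \tag{24}
$$

A further simplification is possible: taking the average of Eq. (14) and using Eq. (B12) we are left with the formula

$$
\Delta G_0 - \Delta G_0' \approx \frac{\sqrt{6}}{\pi}(\gamma - \log n)\,s + \sqrt{2}\,z(n)\,s' , \tag{25}
$$

where $s$ and $s'$ are the standard deviations of $\{W_i\}$ and $\{W'_i\}$, respectively, and $\gamma$ is the Euler-Mascheroni constant (not to be confused with the friction coefficient of the bead).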
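Equation (25) is straightforward to evaluate once $z(n)$ is known. The sketch below solves the transcendental equation (B9) by bisection and then assembles the prediction; the function names and the choice of bisection are ours, made for the sake of the example, and any one-dimensional root finder would do equally well.

```python
import math

EULER_GAMMA = 0.5772156649015329


def z_mode(n, tol=1e-12):
    """Solve sqrt(pi) z [1 + erf(z)] = (n - 1) exp(-z^2), Eq. (B9), for z >= 0."""
    if n < 1:
        raise ValueError("n must be a positive integer")

    def g(z):
        return (math.sqrt(math.pi) * z * (1.0 + math.erf(z))
                - (n - 1) * math.exp(-z * z))

    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:          # g is increasing for z >= 0: bracket the root
        hi *= 2.0
    while hi - lo > tol:        # plain bisection
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def predicted_error(n, s, s_prime):
    """Right-hand side of Eq. (25), in the same energy units as s and s'."""
    gumbel_part = (math.sqrt(6.0) / math.pi) * (EULER_GAMMA - math.log(n)) * s
    normal_part = math.sqrt(2.0) * z_mode(n) * s_prime
    return gumbel_part + normal_part
```

For fixed $n$ and $s$ the prediction is a straight line in $s'$ with slope $\sqrt{2}\,z(n)$, which is how it is compared with data in Figs. 4 and 5.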



Equation (25) states that the error in the evaluation of the energy properties of the hairpin due to the substitution of $\{W_i\}$ with $\{W'_i\}$ in the Jarzynski equation depends on three factors: the standard deviations $s$ and $s'$, and the number of experiments $n$. There is a remarkable difference between the roles played by $s$ and $s'$: the standard deviation $s$ of the accumulated work generally depends only on the pulling rate $v$ and the chemical nature of the construct comprising molecule and handles; the standard deviation $s'$ of the transferred work, on the other hand, is also strongly dependent on the bandwidth of the data acquisition system.

The reason is easy to understand: while the area under the FDC [Fig. 2(a)] practically doesn't change if we smooth out the curve, the area under the FEC [Fig. 2(b)] is heavily dependent on the fluctuations of the extremal points $x_i$ and $x_f$ (see also Fig. 6). We will have more to say about this point in Sec. III.

FIG. 4: Numerical test of Eq. (25). The theoretical prediction is compared to the results of numerical simulations of Eq. (9). In abscissa, $s'$ is the standard deviation of the transferred work values $\{W'_i\}$; different values of $s'$ are obtained by varying the filter applied to the data. In ordinate, the error $\Delta G_0 - \Delta G_0'$ (in $k_BT$ units) on the determination of the free energy of formation of the hairpin due to the erroneous use of $W'$ in the Jarzynski estimator. Each point represents the result of the analysis of $n = 9000$ trajectories.

In the derivation of Eq. (25) we have made use of three approximations:

• we discarded all the contributions to the sum of exponentials in Eqs. (18) and (19) except the one coming from the minimum-work trajectory;

• we assumed a normal distribution for $\{W'_i\}$;

• we assumed a Gumbel distribution for $\{W_i\}$.

Although each one of them seems reasonable, it is worthwhile, before discussing the experimental utility of Eq. (25), to check the final result against a numerical test.

D. A numerical test 



In order to validate Eq. (25), we have performed a numerical simulation of Eq. (9), generating hundreds of thousands of curves like the two represented in Fig. 2. The effect of the instrumental bandwidth has been mimicked by applying different filters to the data, so that each point of the FDC or FEC actually represents an average over $m$ consecutive integration steps. In this way we have generated data in a fair range of values of $s'$. The results are illustrated in Fig. 4.

The first observation is that the error can be very large: as much as 50 $k_BT$ in a system where the true $\Delta G_0$ is 57.7 $k_BT$, which amounts to a relative error not far from 100%. Then we observe that, in spite of the somewhat rough simplifications used in its derivation, the analytical prediction of Eq. (25) fares reasonably well in the comparison with the simulated data, although there seems to be a small, apparently systematic underestimation of $\Delta G_0 - \Delta G_0'$. Finally, a comment about the range of $s'$: The standard deviation of $\{W'_i\}$ is a linear function of the amplitude of the fluctuations of $x$, given by Eq. (A16); this fixes an upper limit to the range of $s'$ that can be explored without changing the system.



III. AN EXPERIMENTAL TEST

This section reports the results of an experimental test of Eq. (25), whose theoretical derivation has been presented in Sec. II. The instrument we employed is a dual-beam miniaturized optical tweezers with fiber-coupled diode lasers (845 nm wavelength) that produce a piezo-controlled movable optical trap and measure force using conservation of light momentum. The molecule is a DNA hairpin of sequence 5'-GCGAGCCATAATCTCATCTGGAAACAGATGAGATTATGGCTCGC-3' hybridized to two double-stranded DNA (dsDNA) handles (29 base pairs long). Pulling experiments were performed at 25 °C in a buffer containing Tris-HCl pH 7.5, 1 M EDTA and 1 M NaCl. The data that we show (see Tab. I) have been measured from 7 specimens in hundreds of stretching-releasing cycles performed at a pulling speed of 200 nm/s (equivalent to a loading rate of 13.8 pN/s). The use of DNA hairpins presents several advantages over the RNA hairpins that were used in pioneering experiments of this kind.

FIG. 5: Experimental test of Eq. (25). In abscissa, $s'$ is the standard deviation of the transferred work values $\{W'_i\}$; different values of $s'$ are obtained by varying the stiffness of the trap and the bandwidth. In ordinate, the error $\Delta G_0 - \Delta G_0'$ on the determination of the hairpin energy levels due to the erroneous use of $W'$ in the Jarzynski estimator. See Tab. I for further details about the data.

In order to measure the dependence of $\Delta G_0 - \Delta G_0'$ on the bandwidth, we employed a fast analog-to-digital converter that makes it possible to increase the data acquisition frequency from the standard value of 1 kHz to as much as 100 kHz (20 kHz, however, is larger than the corner frequency of the bead, around 10 kHz, and proved to be enough for this test). The availability of high-frequency data is a good start, but is not enough without a data analysis procedure that carefully preserves the statistical properties of the boundary term [see Eq. (14)]. Here are the main steps of the data analysis that we performed:

1. The stream of data is split into single unfolding or refolding events.

2. Taking advantage of the fact that the elastic response of the short dsDNA handles is, to a good approximation, Hookean, we fit the folded and unfolded branches of the FDC with straight lines.

3. The unavoidable small instrumental drift (which manifests itself as an unphysical increase or decrease of the measured value of the trap position $\lambda$) is corrected by shifting the FDC in such a way that the straight line fitting the folded branch crosses $\lambda = 0$ at the same value of the force in every event.

4. The FDCs are integrated between two fixed values $\lambda_i$ and $\lambda_f$. These integrations produce two sets of accumulated work values $\{W_i\}$: one for the unfolding and one for the refolding process.

5. Each FEC is integrated between $x_i(\Gamma) = \lambda_i - f_i(\Gamma)/k_b$ and $x_f(\Gamma) = \lambda_f - f_f(\Gamma)/k_b$; note that, while $\lambda_i$ and $\lambda_f$ are the same for all trajectories, $f_i$ and $f_f$ depend on the trajectory $\Gamma$, and so do $x_i$ and $x_f$. In this way we obtain two sets of transferred work values $\{W'_i\}$: again, one for unfolding and one for refolding trajectories.

6. The Jarzynski estimators for the accumulated and the transferred work are computed by means of Eqs. (18) and (19), and then Eqs. (15) and (17) give $\Delta G_0$ and $\Delta G_0'$.
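Steps 2 to 4 of the procedure above can be sketched as follows. This is an illustration under assumptions rather than the analysis code used for the data of Tab. I: the function names, the boolean mask selecting the folded branch, the reference force `f_ref` and the linear interpolation onto a uniform grid are all hypothetical choices made for the example.

```python
import numpy as np


def align_fdc(lam, f, folded_mask, f_ref):
    """Drift correction (steps 2 and 3): fit the folded branch with a straight
    line and shift lambda so that the line crosses lambda = 0 at force f_ref."""
    slope, intercept = np.polyfit(lam[folded_mask], f[folded_mask], 1)
    shift = (f_ref - intercept) / slope
    return lam - shift


def accumulated_work(lam, f, lam_i, lam_f, n_grid=2001):
    """Step 4: area under the FDC between the fixed limits lam_i and lam_f.

    The curve is linearly interpolated, so lam must be increasing
    (pulling direction) and must cover the interval [lam_i, lam_f].
    """
    grid = np.linspace(lam_i, lam_f, n_grid)
    f_grid = np.interp(grid, lam, f)
    return float(np.sum(0.5 * (f_grid[1:] + f_grid[:-1]) * np.diff(grid)))
```

Because every event is shifted to the same reference line before integration, the limits $\lambda_i$ and $\lambda_f$ have the same meaning for all trajectories, which is what step 4 requires.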



Table I shows that Eq. (25) is generally quite close to the experimental results, most of the time predicting a discrepancy between $\Delta G_0$ and $\Delta G_0'$ within a few $k_BT$ of the observed value. The occasional large deviations between theory and experiment shouldn't be too surprising in view of the statistical nature of the quantity we are measuring and the approximate derivation of Eq. (25).



The data reported in Tab. I can be graphically represented in analogy with Fig. 4. In principle, we expect each dataset to be represented by a slightly different straight line, as the number of trajectories $n$ varies from a minimum of 143 to a maximum of 635 (see Tab. I). However, in practice the differences are small enough that all the theoretically expected values are very close to the line that in Fig. 5 is denoted as "analytical approximation".

Figure 6 shows a typical trajectory plotted as FDC and FEC, using 20 kHz and 1 kHz data. It can be immediately appreciated that, while the area under the FDC is insensitive to the sampling frequency, the area under the FEC may display important differences due to the fluctuations of the integration extrema.
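The different sensitivity of the two areas to the bandwidth can be reproduced with a few lines of code. The sketch below uses synthetic data, not the measurements of Fig. 6: the extension is assumed to be a linear response plus uncorrelated Gaussian noise, the bandwidth reduction is mimicked by averaging over blocks of $m$ consecutive points, and all names and parameter values are placeholders.

```python
import numpy as np


def block_average(a, m):
    """Average over consecutive blocks of m points (crude low-pass filter)."""
    n = (a.size // m) * m
    return a[:n].reshape(-1, m).mean(axis=1)


def works(lam, x, k_b):
    """Trapezoidal areas under the FDC (W) and under the FEC (W')."""
    f = k_b * (lam - x)
    f_mid = 0.5 * (f[1:] + f[:-1])
    return np.sum(f_mid * np.diff(lam)), np.sum(f_mid * np.diff(x))


def work_spread(m, n_traj=400, k_b=0.07, seed=0):
    """Standard deviations (s, s') of W and W' over synthetic trajectories
    filtered by block averaging over m points."""
    rng = np.random.default_rng(seed)
    lam = np.linspace(0.0, 50.0, 2000)
    out = []
    for _ in range(n_traj):
        x = 0.8 * lam + rng.normal(scale=2.0, size=lam.size)  # synthetic extension
        out.append(works(block_average(lam, m), block_average(x, m), k_b))
    out = np.array(out)
    return out[:, 0].std(), out[:, 1].std()
```

Comparing `work_spread(1)` with `work_spread(20)` for this synthetic model, the spread of $W$ is essentially unaffected by the filter, while the spread of $W'$, which is dominated by the fluctuations of the end points, shrinks considerably.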



A. Bi-directional methods 

If the experimental situation makes it possible to implement not only the protocol $\lambda(t)$, but also the time-reversed protocol $\tilde\lambda(t) = \lambda(\Delta t - t)$, where $\Delta t = t_f - t_i$ is the duration of the experiment, then a more efficient way of estimating free energy differences is to apply a bi-directional method, which takes advantage of the knowledge of both a "forward" and a "reverse" work distribution. Bi-directional methods are based on another fluctuation relation, the Crooks theorem

$$
\frac{\phi_{W_{\rm FOR}}(w)}{\phi_{W_{\rm REV}}(-w)} = \exp\left[\frac{w - \Delta G}{k_BT}\right] , \tag{26}
$$

where $\phi_{W_{\rm FOR}}(w)$ [$\phi_{W_{\rm REV}}(w)$] is the probability density function of the work along the forward (reverse) process. The Crooks theorem, like the Jarzynski equality, is written for the accumulated work $W$. Writing an analytical approximation of the error introduced by the erroneous use of the transferred work $W'$, in the style of what we did in Sec. II, looks considerably more complicated, but direct evidence of the role of the bandwidth is given in Fig. 7, where $\log[\phi_{W_{\rm FOR}}(w)/\phi_{W_{\rm REV}}(-w)]$ is plotted as a function of $(w - \Delta G)/(k_BT)$ for two values of the bandwidth.
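A simple way to look at data in the spirit of Eq. (26) is to approximate both work distributions and inspect the slope of the logarithm of their ratio. The sketch below is an assumption-laden illustration, not the procedure used for Fig. 7: it replaces the two histograms with Gaussian fits (a strong simplification, adequate only when the work distributions are close to normal), and the function name is hypothetical.

```python
import numpy as np


def crooks_log_ratio(w, works_fwd, works_rev):
    """log[phi_FOR(w) / phi_REV(-w)] with both densities replaced by
    Gaussians having the sample mean and standard deviation (assumption)."""
    m_f, s_f = np.mean(works_fwd), np.std(works_fwd)
    m_r, s_r = np.mean(works_rev), np.std(works_rev)
    log_phi_f = -0.5 * ((w - m_f) / s_f) ** 2 - np.log(s_f)
    log_phi_r = -0.5 * ((-w - m_r) / s_r) ** 2 - np.log(s_r)
    return log_phi_f - log_phi_r   # the sqrt(2 pi) factors cancel


# Synthetic check in units of kBT: Gaussian works that satisfy Eq. (26)
rng = np.random.default_rng(2)
dG, s = 10.0, 2.0
fwd = rng.normal(dG + 0.5 * s ** 2, s, size=200_000)
rev = rng.normal(-dG + 0.5 * s ** 2, s, size=200_000)
w = np.linspace(8.0, 12.0, 21)
slope, intercept = np.polyfit(w, crooks_log_ratio(w, fwd, rev), 1)
dG_est = -intercept / slope        # crossing point of the two distributions
```

For data obeying Eq. (26) the slope is close to 1 (in units of $1/k_BT$) and the log-ratio vanishes at $w = \Delta G$; a slope far from 1 signals that the fluctuation relation does not hold for the work definition being used.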

The experimental results are summarized in Tab. II. Even if the Crooks theorem is not satisfied, the estimate of $\Delta G_0$ that we get by blindly substituting in Eq. (26) the transferred work $W'$ for the accumulated work $W$ is not as bad as the one obtained by using the Jarzynski equality.



B. Role of the data analysis technique 



The data analysis protocol detailed in Sec. III may be the best suited to the task of verifying Eq. (25), but is not feasible if one's experimental setting only provides access to the transferred work $W'$ (and makes it difficult to accurately estimate the stiffness $k_b$ of the trap). If this is the case, then one either employs a version of the fluctuation theorem written for $W'$ (as in the already cited Ref. [15]), or uses $W'$ in Eq. (1), but takes care of minimizing the error on the determination of $\Delta G_0$, approximately given by Eq. (25). For example, the folded and unfolded branches of the FEC can be smoothed (by application of a filter, by spline-fitting, etc.) until the variance of $\{W'_i\}$ is entirely due to the distribution of the breaking point, in which case the difference between $\Delta G_0$ and $\Delta G_0'$ becomes negligible compared to other sources of experimental error. This is the reason why both Refs. [22] and [24] obtained an acceptable experimental test of the Jarzynski equality and the Crooks theorem, respectively, even if erroneously using the transferred work.

TABLE I: Experimental results: Comparison between the experimental (also shown in Fig. 5) and the theoretical [based on Eq. (25)] values of $(\Delta G_0 - \Delta G_0')/(k_BT)$. The datasets labeled "1 kHz" and "20 kHz" refer to the same experiment, with the standard (low-frequency) and the new (high-frequency) data acquisition system. The stiffness of the trap $k_b$ is measured in pN/µm, while $n$ is the number of trajectories.

FIG. 6: (a) An experimental force-distance curve (FDC) observed with a high-frequency (20 kHz) and a low-frequency (1 kHz) data acquisition system. The area under the curve, which is a measure of the accumulated work $W$, practically doesn't change. (b) The force-extension curve (FEC) associated with the pulling experiment represented in Fig. 6(a). The area under the curve, which represents the transferred work $W'$, depends on the frequency of the data acquisition system because of the large fluctuations of the integration extrema. Insets: magnified views of the region around the maximum of the force.

TABLE II: Experimental results. The datasets labeled "1 kHz" and "20 kHz" refer to the same experiment, with the standard (low-frequency) and the new (high-frequency) data acquisition system. The datasets labeled "ave n" are obtained from 20 kHz data by averaging over $n$ points.

FIG. 7: ... and low-frequency data, accumulated and transferred work. Data have been shifted along the horizontal axis to be easily compared. Data for the accumulated work (circles and squares) fall onto a (bandwidth-independent) straight line of slope 1.00(8), in quantitative agreement with the prediction of the fluctuation relation Eq. (26). However, data for the transferred work (triangles and rhombi) exhibit bandwidth-dependent, very small slopes (around 0.03) that exclude the validity of a relation equivalent to Eq. (26) for the transferred work.



IV. CONCLUSION

The output of a single-molecule pulling experiment can be graphically represented in the form of a force-extension curve, where the externally applied force is compared to the molecular construct end-to-end distance, or a force-distance curve, where the same force is represented against the physical control parameter, the length that can be directly manipulated by the experimenter. The area under the former curve is the work $W'$ transferred to the molecule subsystem, while the latter curve allows the measurement of the accumulated work $W$, the total amount of work expended on the whole system (experimental apparatus included).

The fluctuation theorems commonly used to compute free energy differences from these out-of-equilibrium processes apply to the work $W$, but not to the work $W'$. In this paper we quantified how large an error is likely to affect the estimate of the free energy at zero force $\Delta G_0$ of the molecule if $W$ is erroneously replaced with $W'$. We found an approximate analytical expression [Eq. (25)] that emphasizes the role of the data analysis procedure and of the bandwidth of the data acquisition system. We confirmed the validity of this approach by both numerical simulation of a toy model and experiments on a DNA hairpin. This work should resolve some issues about the proper way to measure work in single-molecule experiments that have generated discussion and controversy over the past years.



APPENDIX A: THERMODYNAMICS OF THE TOY MODEL


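The model defined in Sec. II is simple enough to allow the analytical solution of its equilibrium thermodynamics. The partition function of the system is

$$
Z(\lambda) = \sum_{\sigma=0,1} \int_{-\infty}^{+\infty} dx\, \exp\left[-\beta H^{(\lambda)}(x,\sigma)\right] , \tag{A1}
$$

where the Hamiltonian is given by Eq. (5). The integration is trivial, so we can immediately write the solution

$$
Z(\lambda) = Z_0(\lambda) + Z_1(\lambda) , \tag{A2}
$$

where

$$
Z_\sigma(\lambda) = \sqrt{\frac{2\pi}{\beta k_t}}\, \exp\left[-\frac{\beta k_{\rm eff}}{2}(\lambda - \ell_\sigma)^2 - \sigma\beta\Delta G_0\right] . \tag{A3}
$$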



Given the partition function, we have access to all the thermodynamic properties of the model; the Gibbs free energy, in particular, is defined as

$$
G(\lambda) = -k_BT \ln Z(\lambda) , \tag{A4}
$$

and the TFDC is given by

$$
\phi(\lambda) = \frac{dG(\lambda)}{d\lambda} = k_{\rm eff}\left[\lambda - P_1(\lambda)\ell_1 - P_0(\lambda)\ell_0\right] , \tag{A5}
$$

where

$$
P_\sigma(\lambda) = \frac{Z_\sigma(\lambda)}{Z(\lambda)} \tag{A6}
$$

is the probability of the state $\sigma$ for a fixed value of $\lambda$. The coexistence value $\lambda_c$ is characterized by the fact that $P_0(\lambda_c) = P_1(\lambda_c)$, hence

$$
\lambda_c = \frac{\ell_0 + \ell_1}{2} + \frac{\Delta G_0}{k_{\rm eff}(\ell_1 - \ell_0)} . \tag{A7}
$$

The corresponding coexistence force is

$$
f_c \equiv \phi(\lambda_c) = \frac{\Delta G_0}{\ell_1 - \ell_0} . \tag{A8}
$$

Notice that in the asymptotic region $\lambda \ll \lambda_c$ the probability of the open state is negligible, so the force goes as $k_{\rm eff}(\lambda - \ell_0)$, while in the region $\lambda \gg \lambda_c$ it is the probability of the closed state that goes to zero, leaving a force dependence of the form $k_{\rm eff}(\lambda - \ell_1)$.

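From Eq. (A5) we can easily write down the reversible work

$$
W_{\rm rev} = \int_{\lambda_i}^{\lambda_f} \phi(\lambda)\, d\lambda . \tag{A9}
$$

The integration can be done analytically using the fact that

$$
\int \frac{a\, dx}{a + e^{bx}} = x - \frac{1}{b}\ln\left(a + e^{bx}\right) . \tag{A10}
$$

Some tedious algebraic manipulation is required before one can write for the reversible work the following exact formula:

$$
W_{\rm rev} = \Delta G_0 + \frac{k_{\rm eff}}{2}\left[(\lambda_f - \ell_1)^2 - (\lambda_i - \ell_0)^2\right] - C , \tag{A11}
$$

where $C$ is a correction, very small if $\lambda_i \ll \lambda_c \ll \lambda_f$ (that is the most common experimental condition), whose explicit form is

$$
C = \frac{1}{\beta} \ln \frac{1 + \exp[-\beta k_{\rm eff}(\ell_1 - \ell_0)(\lambda_f - \lambda_c)]}{1 + \exp[-\beta k_{\rm eff}(\ell_1 - \ell_0)(\lambda_c - \lambda_i)]} . \tag{A12}
$$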
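The closed formula of Eqs. (A11) and (A12) can be cross-checked against a direct numerical integration of the TFDC, Eq. (A5). The following sketch does so; the function names and the numerical values of the parameters (in pN and nm) are illustrative assumptions and are not meant to reproduce the experimental system.

```python
import numpy as np


def tfdc(lam, k_eff, l0, l1, dG0, kBT):
    """Thermodynamic force-distance curve, Eq. (A5)."""
    e0 = -0.5 * k_eff * (lam - l0) ** 2 / kBT
    e1 = -0.5 * k_eff * (lam - l1) ** 2 / kBT - dG0 / kBT
    p1 = 1.0 / (1.0 + np.exp(e0 - e1))          # P_1 = Z_1 / (Z_0 + Z_1)
    return k_eff * (lam - p1 * l1 - (1.0 - p1) * l0)


def reversible_work(lam_i, lam_f, k_eff, l0, l1, dG0, kBT):
    """Exact reversible work, Eqs. (A11) and (A12)."""
    lam_c = 0.5 * (l0 + l1) + dG0 / (k_eff * (l1 - l0))   # Eq. (A7)
    a = k_eff * (l1 - l0) / kBT
    corr = kBT * (np.log1p(np.exp(-a * (lam_f - lam_c)))
                  - np.log1p(np.exp(-a * (lam_c - lam_i))))
    return dG0 + 0.5 * k_eff * ((lam_f - l1) ** 2 - (lam_i - l0) ** 2) - corr


# Placeholder parameters
kBT, k_eff, l0, l1 = 4.11, 0.0614, 0.0, 18.0
dG0 = 60.0 * kBT
lam = np.linspace(150.0, 320.0, 20001)
phi = tfdc(lam, k_eff, l0, l1, dG0, kBT)
numeric = float(np.sum(0.5 * (phi[1:] + phi[:-1]) * np.diff(lam)))   # Eq. (A9)
exact = reversible_work(150.0, 320.0, k_eff, l0, l1, dG0, kBT)
```

With these placeholder values the coexistence point lies well inside the integration interval, which is the regime where the correction $C$ is small.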

In practice, $\lambda_i \ll \lambda_c \ll \lambda_f$, so one can usually forget about $C$ and use Eq. (A5) to rewrite Eq. (A11) as

$$
W_{\rm rev} = \Delta G_0 + \frac{\phi_f^2 - \phi_i^2}{2k_{\rm eff}} , \tag{A13}
$$

where $\phi_i = \phi(\lambda_i)$ and $\phi_f = \phi(\lambda_f)$.

The expectation value of the molecular extension is

$$
\langle x \rangle(\lambda) = \frac{k_b}{k_t}\lambda + \frac{k_m}{k_t}\left[P_1(\lambda)\ell_1 + P_0(\lambda)\ell_0\right] . \tag{A14}
$$

This equation can be rephrased into an expression for the TFEC.

Another interesting quantity is the expectation value of $x^2$,

$$
\langle x^2 \rangle(\lambda) = \frac{k_BT}{k_t} + P_0(\lambda)\,x_{\rm eq}(0)^2 + P_1(\lambda)\,x_{\rm eq}(1)^2 , \tag{A15}
$$

from which we easily obtain the variance for the equilibrium fluctuations of $x$:

$$
\langle \delta x^2 \rangle(\lambda) = \frac{k_BT}{k_t} + P_0(\lambda)P_1(\lambda)\frac{k_m^2}{k_t^2}(\ell_1 - \ell_0)^2 . \tag{A16}
$$

The variance for the equilibrium fluctuations of the force is simply related to that of $x$:

$$
\langle \delta f^2 \rangle(\lambda) = k_b^2\, \langle \delta x^2 \rangle(\lambda) . \tag{A17}
$$



APPENDIX B: AN EXERCISE IN ORDER STATISTICS

Let $\{Y_i\}$ be $n$ independent, identically distributed real-valued random variables with cumulative distribution function (cdf) $\Phi(y) = \Pr(Y_i < y)$. The probability density function (pdf) is defined as the derivative of the cdf: $\phi(y) = \Phi'(y)$. The pdf has the property $\phi(y)\,dy \approx \Pr(y < Y_i < y + dy)$.

The minimum $Y_{(1)}$ of the set $\{Y_i\}$ is itself a random variable whose distribution can be deduced from the knowledge of $\phi(y)$ and $\Phi(y)$. Indeed, the probability $\Phi_{Y_{(1)}}(y)$ that the minimum is no more than $y$ is equal to the probability of having at least one $Y_i < y$. This is given by the binomial distribution as

$$
\Phi_{Y_{(1)}}(y) = 1 - [1 - \Phi(y)]^n . \tag{B1}
$$

Differentiating with respect to $y$ we find the corresponding pdf

$$
\phi_{Y_{(1)}}(y) = n\,[1 - \Phi(y)]^{n-1}\phi(y) . \tag{B2}
$$

The simplest way to characterize the most likely value of $Y_{(1)}$ is to consider the mode, that is, the point where the pdf has a maximum. This is given by solving with respect to $y$ the following equation:

$$
[1 - \Phi(y)]\,\phi'(y) = (n - 1)\,\phi^2(y) . \tag{B3}
$$

In the rest of this section, we specialize these general formulas to the two distributions we used to describe the statistical behavior of the accumulated and transferred work.


1. Normal distribution 



A normally distributed variable of mean $\mu$ and variance $\sigma^2$ is described by the cdf

$$
\Phi^{(N)}(y) = \frac{1}{2} + \frac{1}{2}\,{\rm erf}\left(\frac{y - \mu}{\sqrt{2}\sigma}\right) , \tag{B4}
$$

from which derives the pdf

$$
\phi^{(N)}(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(y - \mu)^2}{2\sigma^2}\right] . \tag{B5}
$$

The distribution of the transferred work $W'$ is often well described by a normal distribution (see Fig. 8). It is convenient to define the reduced variable

$$
z = \frac{\mu - y}{\sqrt{2}\sigma} , \tag{B6}
$$

in terms of which we can write the cdf of the minimum $Y_{(1)}$ of a sample of size $n$

$$
\Phi^{(N)}_{Y_{(1)}}(z) = 1 - \left[\frac{1}{2} + \frac{1}{2}\,{\rm erf}(z)\right]^n \tag{B7}
$$

and its pdf

$$
\phi^{(N)}_{Y_{(1)}}(z) = \frac{n}{\sigma\sqrt{2\pi}} \exp(-z^2) \left[\frac{1}{2} + \frac{1}{2}\,{\rm erf}(z)\right]^{n-1} . \tag{B8}
$$

The mode of the distribution $\phi^{(N)}_{Y_{(1)}}(z)$ is the solution to the following transcendental equation:

$$
\sqrt{\pi}\, z\,[1 + {\rm erf}(z)] = (n - 1)\exp(-z^2) . \tag{B9}
$$

The numerical solution for $n < 10\,000$ is plotted in Fig. 3.

FIG. 8: Comparison between the histogram of the transferred work in one of the experiments reported in Tab. I and the normal distribution that best approximates it.

2. Gumbel distribution 

In both our simulations and experiments, we find that the accumulated work is often adequately represented (see Fig. 9) by a random variable obeying the Gumbel distribution, with cdf

$$
\Phi^{(G)}(y) = 1 - \exp\left[-\exp\left(\frac{y - a}{b}\right)\right] \tag{B10}
$$

and pdf

$$
\phi^{(G)}(y) = \frac{1}{b} \exp\left(\frac{y - a}{b}\right) \exp\left[-\exp\left(\frac{y - a}{b}\right)\right] . \tag{B11}
$$

The parameters $a$ and $b$ can be quickly estimated from the average $\bar{y}$ and the standard deviation $s$ of the sample $\{Y_i\}$ by means of the formulas

$$
b = \frac{\sqrt{6}}{\pi}\, s , \qquad a = \bar{y} + \gamma b , \tag{B12}
$$

where $\gamma$ is the Euler-Mascheroni constant 0.5772... The minimum value $Y_{(1)}$ over the sample is in this case distributed with pdf

$$
\phi^{(G)}_{Y_{(1)}}(y) = \frac{n}{b} \exp\left(\frac{y - a}{b}\right) \exp\left[-n\exp\left(\frac{y - a}{b}\right)\right] . \tag{B13}
$$

The mode of the minimum is therefore given simply by $a - b\log n$.

FIG. 9: Comparison between the histogram of the accumulated work in one of the experiments reported in Tab. I and the Gumbel distribution that best approximates it.
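The moment estimates of Eq. (B12) and the shift of the minimum by $b\log n$ can be checked by sampling. The sketch below draws Gumbel variables by inverting the cdf of Eq. (B10); the function names, the parameter values and the sample sizes are arbitrary choices made for the illustration.

```python
import numpy as np

EULER_GAMMA = 0.5772156649015329


def sample_min_gumbel(a, b, size, rng):
    """Draw samples with cdf 1 - exp[-exp((y - a)/b)], Eq. (B10),
    by inversion of the cdf."""
    u = rng.uniform(np.finfo(float).tiny, 1.0, size)   # u in (0, 1)
    return a + b * np.log(-np.log1p(-u))


def fit_gumbel_moments(y):
    """Moment estimates of the parameters (a, b), Eq. (B12)."""
    b = np.sqrt(6.0) * np.std(y) / np.pi
    a = np.mean(y) + EULER_GAMMA * b
    return a, b


rng = np.random.default_rng(3)
y = sample_min_gumbel(100.0, 3.0, 200_000, rng)
a_hat, b_hat = fit_gumbel_moments(y)

# Minimum over samples of size n: again Gumbel, with a replaced by a - b log n
n = 50
minima = sample_min_gumbel(100.0, 3.0, (5000, n), rng).min(axis=1)
a_min, b_min = fit_gumbel_moments(minima)
```

According to Eq. (B13), `a_min` should be close to `100.0 - 3.0 * np.log(n)` while `b_min` should stay close to the original scale parameter.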